A saddlepoint approximation of the Student's t-statistic was derived
by Daniels and Young [Biometrika 78 (1991) 169-179] under
the very stringent exponential moment condition that requires that 
the underlying density function go down at least as fast as a Normal 
density in the tails. This is a severe restriction on the approxima- 
tion's applicability. In this paper we show that this strong exponen- 
tial moment restriction can be completely dispensed with, that is, 
saddlepoint approximation of the Student's t-statistic remains valid 
without any moment condition. This confirms the folklore that the 
Student's t-statistic is robust against outliers. The saddlepoint ap- 
proximation not only provides a very accurate approximation for the 
Student's t-statistic, but it also can be applied much more widely in 
statistical inference. As a result, saddlepoint approximations should 
always be used whenever possible. Some numerical work will be given 
to illustrate these points. 

1. Introduction. In many statistical applications approximations to the
probability that a random variable (r.v.), say $T_n$, exceeds a certain
threshold value are important since the exact distribution function (d.f.)
of $T_n$ may be very difficult or even impossible to obtain in most cases.
Such approximations are useful, for example, in constructing confidence
intervals and in calculating p-values in hypothesis testing. In those
circumstances, we are usually dealing with tail probabilities of the r.v.
$T_n$. Since these tail probabilities are typically small, accurate
approximations are particularly important.

The "naive" method is to use the Normal approximation, which holds 
under mild conditions. However, this approximation is often too rough to 
be useful for small to moderate sample sizes. A more refined approximation 
is the Edgeworth expansion under some extra conditions. In general, the 
Edgeworth expansion improves the Normal approximation, but can still be 
inaccurate in the tails. 

To overcome the difficulties encountered by the Normal approximation 
and the Edgeworth expansion, one can consider using a saddlepoint approx- 
imation, which provides a very good approximation to the tail, as well as in 
the center of the distribution. By a "good" approximation, here, we imply 
one with a small relative error. By comparison, the Edgeworth expansion 
gives only absolute errors. However, when dealing with tail probabilities, the 
relative error behavior is more important than the absolute error behavior. 
For instance, an error of 0.005 is of little importance when considering tests 
of size 0.05, but is of great importance when considering tests of size 0.01. 
Put another way, if the true probability is 0.01, it is not of much help to 
know that the approximation has absolute error of size $O(n^{-1/2})$ when n is
smaller than, say, 100. When, instead, the relative error is $O(n^{-1})$, we have
a much more useful statement. It is quite common in statistical practice to 
consider test probabilities of the order of 1%, but even smaller probabilities 
are of interest in certain test situations. If, for example, one wishes to inves- 
tigate whether a chemical substance causes cancer, one will be interested in 
very small test probabilities to make a convincing case. In other fields, such 
as reliability, small probabilities are the rule rather than the exception. 
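
The arithmetic behind the 0.005 example can be made explicit. The sketch below only divides an absolute error by the probability being approximated; the function name `relative_error` is illustrative and does not come from the paper.

```python
def relative_error(absolute_error: float, true_probability: float) -> float:
    """Relative error = absolute error divided by the probability being approximated."""
    return absolute_error / true_probability


# The same absolute error of 0.005, judged against two test sizes.
for size in (0.05, 0.01):
    print(f"size {size}: relative error {relative_error(0.005, size):.0%}")
```

An absolute error of 0.005 is a 10% relative error for a test of size 0.05 but a 50% relative error for a test of size 0.01, which is why relative error is the more informative measure in the tails.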

Saddlepoint approximations have been widely studied and used in many 
areas in recent years due to their excellent performance. For more details on 
the statistical importance and applications of saddlepoint approximations, 
one can refer to the books by Field and Ronchetti (1990), Kolassa (1997), 
Jensen (1995), Davison and Hinkley [(1997), Section 9.5] and to the excellent 
review paper by Reid (1988). All the literature clearly demonstrates how 
remarkably accurate the saddlepoint approximation can be. Accordingly, 
one should always use it if it is available. 

It is worth mentioning that the extreme accuracy of the saddlepoint ap- 
proximation is achieved at a cost of requiring a strong moment condition. 
Take the sample mean of independent and identically distributed (i.i.d.) 
r.v.'s, for example. It is known that asymptotic normality holds under the 
second moment condition, and that an r-term Edgeworth expansion is valid 
under the (r + 2)th moment condition plus some smoothness condition (e.g., 
a nonlattice condition or the Cramér condition). However, for the saddlepoint
approximation one needs the much stronger condition that the exponential moment
exists around the origin. This certainly limits the applicability of saddlepoint
approximations in practice.

In this paper we shall focus on the saddlepoint approximation of the 
Student's t-statistic. It is common knowledge that the Student's t-statistic 
plays a pivotal role in statistics and is the most widely used statistic in the 
inference of a population mean. Therefore, accurate approximations to its 
d.f.'s become particularly important. Toward this end, Daniels and Young 
(1991) derived a saddlepoint approximation for the Student's t-statistic. 
However, their conditions are far too strong to be useful in practice. They 
require that the exponential moment of the square of the underlying r.v.'s 
exists near the origin. In other words, the underlying tail probability of the 
r.v.'s needs to go to zero as fast as the Normal distribution does. This is 
indeed a very severe restriction and makes the approximation hardly useful 
in practice. Even the exponential distribution does not satisfy this condition.

One of the purposes of this paper is to investigate how to weaken the 
strong moment condition given in Daniels and Young (1991) in the saddle- 
point approximation of the Student's t-statistic. One of the key findings of 
the paper is that this very strong exponential moment condition can be to- 
tally eliminated. This result is highly significant in statistical inference for 
two reasons: 

1. First of all, it makes the saddlepoint approximation more widely
applicable. It is known [Giné, Götze and Mason (1997)] that the Student's
t-statistic is asymptotically $N(0,1)$ if and only if the r.v. is in the
domain of attraction of the Normal law, and that it has an r-term
($r \ge 1$) Edgeworth expansion under the (r + 2)th moment condition plus
some smoothness condition (e.g., the nonlattice or Cramér condition)
[Hall (1987)]. Neither asymptotic normality nor the Edgeworth expansion
holds under heavy-tailed distributions such as the Cauchy distribution.
By contrast, this paper shows that the saddlepoint approximation does not
need any moment condition at all and, at the same time, provides an
extremely accurate approximation to the tail probability of the Student's
t-statistic.

2. Second, the fact that no moment condition is required for the saddle- 
point approximation shows that the Student's t-statistic can guard against 
possible heavy tail distributions. This confirms the folklore that the Stu- 
dent's t-statistic is very robust against possible outliers. 

For these reasons, the saddlepoint approximation should always be used in 
practice whenever possible. 

The layout of the paper is as follows. Section 2 presents the formulation 
of the problem. Some notation and a brief review are given in Section 3. The 
main result will be presented in Section 4. Some numerical studies are given 
in Section 5. The proofs are given in Section 6. All technical details are left 
to the Appendix.

2. Formulation of the problem. Let $\{X, X_n, n \ge 1\}$ be a sequence of i.i.d.
nondegenerate r.v.'s with d.f. $F(x)$. Write

$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

Now consider the Student's t-statistic

$$T_n := \sqrt{n}\,\bar X / S, \qquad \text{where } S^2 = (n-1)^{-1}\sum_{i=1}^{n}(X_i - \bar X)^2 \text{ for } n \ge 2.$$

It is known that asymptotic normality of $T_n$ holds if and only if X is in
the domain of attraction of the Normal law [Giné, Götze and Mason (1997)],
which implies that $E|X|^{2-\varepsilon} < \infty$ for any $\varepsilon > 0$. Hall (1987) showed that $T_n$
has an r-term ($r \ge 1$) Edgeworth expansion under the (r + 2)th moment
condition plus some smoothness condition (e.g., nonlattice or Cramér
condition). On the other hand, Daniels and Young (1991) derived a saddlepoint
approximation of Lugannani and Rice's (1980) type for the tail probability
of $T_n$ under the assumption that the joint moment generating function of X
and $X^2$ exists, that is,

(2.1) $M(s,t) = \exp\{K(s,t)\} = E e^{sX + tX^2} < \infty$

for $(s,t)^T$ in a neighborhood of the origin. However, condition (2.1) requires
that the tail probability of the underlying d.f. drop to zero at least as fast
as that of a Normal r.v. This is, indeed, a very restrictive requirement; for
example, it is violated even for the Exponential distribution. This severely
limits its applicability in statistical inference. The natural question is: "Is it
possible to weaken the strong exponential moment condition and, if so, how
far can we go?"

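Note that $T_n$ is closely related to the so-called self-normalized sum
$S_n/V_n$, where

$$S_n = \sum_{i=1}^{n} X_i, \qquad V_n^2 = \sum_{i=1}^{n} X_i^2.$$

To see this, we note the following identity:

$$T_n = \frac{S_n}{V_n}\left(\frac{n-1}{n - (S_n/V_n)^2}\right)^{1/2}.$$

It suffices to investigate the self-normalized sum, $S_n/V_n$, because of the
following identity:

(2.2) $P(T_n \ge x) = P\left(\frac{S_n}{V_n} \ge x\left(\frac{n}{n + x^2 - 1}\right)^{1/2}\right).$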

There has been a growing literature on the study of self-normalized sums
in recent years. For instance, one can refer to Logan, Mallows, Rice and
Shepp (1973) for weak convergence, to Griffin and Kuelbs (1989, 1991) for
a self-normalized law of the iterated logarithm, to Giné, Götze and Mason
(1997) for the necessary and sufficient condition for asymptotic normality,
and to Wang and Jing (1999) for the exponential nonuniform Berry-Esseen
bound under finite moment conditions. However, the work most relevant to
the present paper is that by Shao (1997), who studied self-normalized large
deviations. Among other results, Shao [(1997), Corollary 1.1] showed the
following result.

Theorem 2.1 [Shao (1997)]. Assume that either $EX = 0$ or $EX^2 = \infty$.
Then for $x > 0$,

$$\lim_{n\to\infty} P\left(\frac{S_n}{V_n} \ge x\sqrt{n}\right)^{1/n} = \sup_{c \ge 0}\,\inf_{t \ge 0}\, E\exp\bigl(t(cX - x(X^2 + c^2)/2)\bigr).$$

Since for any r.v. X, either $EX^2 < \infty$ or $EX^2 = \infty$, the assumption that
$EX = 0$ is reasonable if $EX^2 < \infty$. In other words, the large deviation for
the self-normalized sum in Theorem 2.1 holds without assuming any moment
conditions [see Remark 1.1 of Shao (1997)]. By contrast, a strong
condition (2.1) is needed to derive the saddlepoint approximation for the 
self-normalized sum by Daniels and Young's approach, as noted earlier. This 
raises the question of whether one can completely eliminate condition (2.1)
in the saddlepoint approximation of the self-normalized sum. The answer to 
this question is in the affirmative, as is shown later in the paper. 
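
The link between the Student's t-statistic and the self-normalized sum can be checked numerically. The sketch below assumes the usual definitions $S_n = \sum X_i$ and $V_n^2 = \sum X_i^2$ [as in Shao (1997)] and the algebraic identity $T_n = (S_n/V_n)\{(n-1)/(n - (S_n/V_n)^2)\}^{1/2}$; the function names are illustrative only.

```python
import math
import random
import statistics


def t_statistic(xs):
    """Student's t-statistic sqrt(n) * mean / S, with S the (n - 1)-denominator sample s.d."""
    n = len(xs)
    return math.sqrt(n) * statistics.mean(xs) / statistics.stdev(xs)


def self_normalized_sum(xs):
    """S_n / V_n with S_n = sum(x_i) and V_n^2 = sum(x_i^2) (assumed definitions)."""
    return sum(xs) / math.sqrt(sum(x * x for x in xs))


def t_from_self_normalized(u, n):
    """Map u = S_n / V_n to T_n through T_n = u * sqrt((n - 1) / (n - u^2))."""
    return u * math.sqrt((n - 1) / (n - u * u))


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(5)]
direct = t_statistic(xs)
via_ratio = t_from_self_normalized(self_normalized_sum(xs), len(xs))
print(direct, via_ratio)  # the two values agree up to rounding error
```

Because the map from $S_n/V_n$ to $T_n$ is strictly increasing on $(-\sqrt{n}, \sqrt{n})$, any tail probability of $T_n$ is a tail probability of $S_n/V_n$ at a transformed threshold.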

3. Notation and brief review. In this section we shall introduce some 
notation that is used in later sections. We do this by briefly deriving saddle- 
point approximations of the self-normalized sum Sn/Vn under the strong ex- 
ponential moment condition (2.1), following similar lines to those in Daniels 
and Young (1991). 

The first step involves finding the saddlepoint approximation of the
joint density of $(\bar X, \bar Y)^T$, where $Y = X^2$, $Y_i = X_i^2$ for $1 \le i \le n$ and
$\bar Y = n^{-1}\sum_{i=1}^{n} Y_i$. Assume that the cumulant-generating function of
$(X, X^2)^T$ satisfies

(3.1) $K(s,t) = \ln M(s,t) = \ln E e^{sX + tX^2} < \infty$

in a neighborhood of the origin. Denote

$$K_s(s,t) := \frac{\partial K(s,t)}{\partial s}, \qquad K_t(s,t) := \frac{\partial K(s,t)}{\partial t}, \qquad K_{ss}(s,t) := \frac{\partial^2 K(s,t)}{\partial s^2}, \quad \text{and so on.}$$

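Assume that $(X, X^2)^T$ has an integrable characteristic function. Then, by
the Fourier inversion formula, the saddlepoint approximation to the joint
density, $f_n(x,y)$, of $(\bar X, \bar Y)^T$ is given by

(3.2) $f_n(x,y) = \left(\frac{n}{2\pi i}\right)^2 \int\!\!\int e^{-n[sx + ty - K(s,t)]}\,ds\,dt = \hat f_n(x,y)(1 + r_n/n),$

where integration is along admissible paths in $\mathbb{R}^2$, and

(3.3) $\hat f_n(x,y) = \frac{n}{2\pi}\,\frac{e^{-n[\hat s x + \hat t y - K(\hat s,\hat t)]}}{[K_{ss}(\hat s,\hat t)K_{tt}(\hat s,\hat t) - K_{st}^2(\hat s,\hat t)]^{1/2}},$

where $\hat s = \hat s(x,y)$ and $\hat t = \hat t(x,y)$ are solutions to

(3.4) $K_s(\hat s,\hat t) = x, \qquad K_t(\hat s,\hat t) = y,$

and $|r_n| \le C$ for some $C > 0$ if $(x,y)^T$ is contained in a compact set.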

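The second step is to find the joint density $f_{(\bar X,\,\bar X/\sqrt{\bar Y})}(a,b)$. Let $a = x$,
$b = x/\sqrt{y}$ ($y > 0$). The inverse transformation and its Jacobian determinant
are

(3.5) $x = x(a,b) := a, \qquad y = y(a,b) := a^2/b^2, \qquad J(a,b) = 2a^2/b^3.$

Thus, the saddlepoint approximation to the joint density of $(\bar X, \bar X/\sqrt{\bar Y})^T$ is

$$\hat f_{(\bar X,\,\bar X/\sqrt{\bar Y})}(a,b) = \frac{n}{2\pi}\,J(a,b)\,\det\{\Delta(a,b)\}^{-1/2}\,e^{-n\Lambda(a,b)},$$

where $\hat s = \hat s(x(a,b),y(a,b))$, $\hat t = \hat t(x(a,b),y(a,b))$, $\Delta(a,b)$ is the matrix of
second derivatives of $K$ evaluated at $(\hat s,\hat t)$, so that
$\det\{\Delta(a,b)\} = K_{ss}K_{tt} - K_{st}^2$, and

$$\Lambda(a,b) = \hat s\,x(a,b) + \hat t\,y(a,b) - K(\hat s,\hat t) = \hat s a + \hat t a^2/b^2 - K(\hat s,\hat t),$$

where $\hat s$ and $\hat t$ satisfy $K_s(\hat s,\hat t) = a$ and $K_t(\hat s,\hat t) = a^2/b^2$. After some simple
algebra, we obtain

$$\Lambda_a(a,b) = \hat s + \frac{2a\hat t}{b^2}, \qquad \Lambda_b(a,b) = -\frac{2a^2\hat t}{b^3},$$

$$\Lambda_{aa}(a,b) = \frac{2\hat t}{b^2} + \left(1, \frac{2a}{b^2}\right)\Delta(a,b)^{-1}\left(1, \frac{2a}{b^2}\right)^T.$$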

The third step involves finding the marginal density of $\bar X/\sqrt{\bar Y}$. Let $a_0 =
a_0(b)$ be such that

$$\Lambda(a_0,b) := \inf_{a}\Lambda(a,b).$$

If we assume that $\Lambda_{aa}(a_0,b) > 0$, then $a_0 = a_0(b)$ is the unique solution
of $\Lambda_a(a_0,b) = 0$. Then the Laplace approximation of the marginal density of
$\bar X/\sqrt{\bar Y}$ is

$$\hat f_{\bar X/\sqrt{\bar Y}}(b) = \frac{\sqrt{n}\,J(a_0,b)\,e^{-n\Lambda(a_0,b)}}{\sqrt{2\pi}\,\det\{\Delta(a_0,b)\}^{1/2}\,\Lambda_{aa}^{1/2}(a_0,b)}.$$

Finally, by applying another Laplace approximation in integrating $\hat f_{\bar X/\sqrt{\bar Y}}(b)$,
we get the saddlepoint approximation for the self-normalized sum. We summarize
the result in the following theorem.

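Theorem 3.1. Assume that:

(C1) $E e^{i\xi X + i\eta X^2} \in L^{\nu}(\mathbb{R}^2)$ for some $\nu \ge 1$, that is, $\int\!\!\int |E e^{i\xi X + i\eta X^2}|^{\nu}\,d\xi\,d\eta < \infty$.

(C2) $\Lambda_{aa}(a_0,b) > 0$.

(C3) $E e^{sX + tX^2} < \infty$ in a neighborhood of the origin.

Then we have

$$P\left(\bar X/\sqrt{\bar Y} \ge b\right) = 1 - \Phi(\sqrt{n}\,w) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\left(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\right),$$

where $w = \sqrt{2\Lambda(a_0,b)}$ and $v = -t_0\,\det\{\Delta(a_0,b)\}^{1/2}\,\Lambda_{aa}^{1/2}(a_0,b)$, and $(s_0,t_0,a_0)$
are solutions $(s,t,a)$ to the equations

(3.6) $s + \frac{2ta}{b^2} = 0, \qquad \frac{E X e^{sX + tX^2}}{E e^{sX + tX^2}} = a, \qquad \frac{E X^2 e^{sX + tX^2}}{E e^{sX + tX^2}} = \frac{a^2}{b^2}.$

Remark 3.1. From the first equation of (3.6), we obtain that $s =
-2ta/b^2$. Therefore, on substituting this into the other two equations,
(3.6) reduces to

(3.7) $\frac{E X e^{t(-2aX/b^2 + X^2)}}{E e^{t(-2aX/b^2 + X^2)}} = a, \qquad \frac{E X^2 e^{t(-2aX/b^2 + X^2)}}{E e^{t(-2aX/b^2 + X^2)}} = \frac{a^2}{b^2}.$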

4. Main results. The saddlepoint approximation for the Student's
t-statistic under the strong exponential moment conditions was given in
Theorem 3.1. Condition (C1) is a smoothness condition, which validates the
Fourier inversion formula (3.2). It is satisfied, for instance, when the r.v.
X has a density function. The main purpose of this paper is to remove 
conditions (C2) and (C3) in Theorem 3.1. 
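
As a concrete instance of Theorem 3.1, the quantities w and v can be worked out in closed form when X is standard Normal. The derivation below is an illustration added here and is not part of the original argument. It assumes $X \sim N(0,1)$, for which (C3) holds because $E e^{sX + tX^2} < \infty$ whenever $t < 1/2$; it writes $\Lambda(a,b) = sa + ta^2/b^2 - K(s,t)$ for the exponent, $\Delta$ for the matrix of second derivatives of $K$, and takes $v = -t_0 \det\{\Delta\}^{1/2}\Lambda_{aa}^{1/2}$ as in Theorem 3.1.

```latex
\begin{aligned}
K(s,t) &= -\tfrac12\log(1-2t) + \frac{s^2}{2(1-2t)}, \qquad t < \tfrac12,\\
K_s(s,t) &= \frac{s}{1-2t}, \qquad
K_t(s,t) = \frac{1}{1-2t} + \frac{s^2}{(1-2t)^2},\\
\text{(3.6)} \ &\Longrightarrow\ a_0 = b, \quad s_0 = \frac{b}{1-b^2}, \quad
t_0 = -\frac{b^2}{2(1-b^2)},\\
\Lambda(a_0,b) &= s_0 a_0 + t_0 a_0^2/b^2 - K(s_0,t_0) = -\tfrac12\log(1-b^2),\\
w &= \sqrt{2\Lambda(a_0,b)} = \sqrt{-\log(1-b^2)},\\
\det\{\Delta(a_0,b)\} &= K_{ss}K_{tt} - K_{st}^2 = 2(1-b^2)^3, \qquad
\Lambda_{aa}(a_0,b) = \frac{2}{b^2},\\
v &= -t_0\,\det\{\Delta(a_0,b)\}^{1/2}\,\Lambda_{aa}^{1/2}(a_0,b) = b\sqrt{1-b^2}.
\end{aligned}
```

With n = 5 and b = 0.5 this gives $w \approx 0.536$ and $v \approx 0.433$, and the right-hand side of the approximation evaluates to about 0.1539, the value reported in the Saddle column of Table 1.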

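Theorem 4.1. Let $0 < b < 1$ and let X be a r.v. with $EX = 0$ or $EX^2 =
\infty$. Assume further that condition (C1) in Theorem 3.1 holds. Then

(4.1) $P\left(\bar X/\sqrt{\bar Y} \ge b\right) = 1 - \Phi(\sqrt{n}\,w) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\left(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\right),$

where w and v are defined the same as in Theorem 3.1.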



We make the following remarks: 

1. When $-1 < b < 0$, similarly, we have

P(X/F„ <b) = HV^u^) - (---- + 0(n-)) . 

2. Theorem 4.1 remains valid when $b = \pm 1$. Take $b = 1$ for instance. From
Proposition 6.1, condition (C1) implies that X is a continuous r.v. Then
the left-hand side of (4.1) is

$$P\left(\bar X/\sqrt{\bar Y} \ge b\right) = P(X_1 = \cdots = X_n > 0) = 0.$$

On the other hand, it can be shown that $w = \infty$ if $b = 1$ (see Remark ??),
which implies that the right-hand side of (4.1) is also zero.

3. The case $b = 0$ is slightly different. By the Berry-Esseen bound, we
have

$$P\left(\bar X/\sqrt{\bar Y} \ge 0\right) = P(\bar X \ge 0) = \tfrac12\{1 + O(n^{-1/2})\},$$

provided that $E|X|^3 < \infty$, which is the minimal moment condition required
here. Comparing this with Theorem 4.1, we notice that a stronger
condition is needed for the case when $b = 0$ than when $b \ne 0$. It may seem
odd that one needs stronger conditions in the middle of the distribution
than in the tails. The reason is that when $b = 0$ there is nothing to offset
the effect of possibly heavy tail distributions. Therefore, one must impose
extra conditions to control the tail behavior.



SADDLEPOINT APPROXIMATION FOR T-STATISTIC 






5. Numerical study. In this section we conduct some numerical studies
to investigate the performance of the saddlepoint approximation for the
Student's t-statistic. Let $X_1, \ldots, X_n$ be a random sample from a distribution
with p.d.f. $f(x)$. We shall choose $f(x)$ from several well-known density
functions, ranging from one with very thin tails (e.g., Normal density) to one
with rather heavy tails (e.g., Cauchy).

Our interest is to calculate the probability of the self-normalized sum,
$P(\bar X/\sqrt{\bar Y} \ge b)$, for a range of values of $b \in (0,1)$. Since the exact value of
the above probability is difficult to obtain in practice, we calculate its
"exact" probability by 1,000,000 Monte Carlo simulations. Then, we compare
how well the saddlepoint approximation performs in comparison with other
approximation methods, such as the large deviation [Shao (1997)], the Edge- 
worth expansion [Hall (1987)], and the Normal approximation. 
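
The Monte Carlo step described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function names, the seed and the reduced number of replications are assumptions made here, and the relative error is taken to be |approximation - true| / true.

```python
import math
import random


def self_normalized_ratio(xs):
    """Return mean(x) / sqrt(mean(x^2)), which always lies in [-1, 1]."""
    n = len(xs)
    return (sum(xs) / n) / math.sqrt(sum(x * x for x in xs) / n)


def monte_carlo_tail(sampler, n, b, reps, seed=0):
    """Estimate P(ratio >= b) from `reps` simulated samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [sampler(rng) for _ in range(n)]
        if self_normalized_ratio(xs) >= b:
            hits += 1
    return hits / reps


def relative_error(approx, true_value):
    """Relative error |approx - true| / true (assumed definition)."""
    return abs(approx - true_value) / true_value


def standard_normal(rng):
    """Draw one N(0, 1) variate from the supplied generator."""
    return rng.gauss(0.0, 1.0)


# 20,000 replications keep the run short; the text uses 1,000,000.
estimate = monte_carlo_tail(standard_normal, n=5, b=0.5, reps=20_000)
print(estimate)
```

Any other sampler with the same signature can be passed in, for example `lambda rng: math.tan(math.pi * (rng.random() - 0.5))` for standard Cauchy draws.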

For illustration purposes, we choose the sample size to be n = 5, since 
different sample sizes display similar patterns. In the tables below, we use 
the following abbreviations: 

"True" = true probability, 

"Saddle" = saddlepoint approximation, 

"Edgeworth" = Edgeworth expansion, 

"L.D." = large deviation, 

"N.A." = Normal approximation, 

"R.E." = relative error. 



5.1. Saddlepoint approximation vs. large deviation. Here we compare the
saddlepoint approximation for self-normalized sums and large deviation
results of Shao (1997). Let $X_1, \ldots, X_n$ be a random sample from the Standard
Normal distribution with p.d.f.

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$$

The reason for deliberately choosing this "nicest" density function is based 
on the belief that any approximation method should probably work at its 
"best" under this special situation if it works at all. In other words, if a 
method does not work well in this case, we cannot expect it to work well 
in other cases either. The simulation results are presented in Table 1 and 
Figure 1. 

We first make some general remarks. 

(i) First of all, the saddlepoint approximation provides extremely accurate
approximations to the exact probabilities and performs uniformly better
than the other approximation methods, even for sample sizes as small as 5.
In fact, Figure 1 shows that the saddlepoint approximation is almost
indistinguishable from the true probability. The superiority of the saddlepoint
approximation becomes even more pronounced in the tails of the distributions.

(ii) Since the sample is from a Normal distribution, the Normal approximation
and one-term Edgeworth expansion to $P(\bar X/\sqrt{\bar Y} \ge b)$ coincide. Table
1 shows that the Normal approximation gives very good approximation at
the center of the distribution in this "nicest" case. However, the approximation
deteriorates quickly toward the tail area of the distribution.

(iii) The large deviation performs miserably throughout the whole range. 
It is much worse than even the Normal approximation at the center of the 
distribution. In the tail area, the saddlepoint approximation is much superior 
to the large deviation. This shows that one cannot rely on the large
deviation to give accurate approximations of probabilities.

This example clearly demonstrates that the large deviation is no sub- 
stitute for the saddlepoint approximation when it comes to accurate ap- 
proximations, even for a case as nice as the Normal distribution. The same 
phenomenon has also been found for other underlying d.f.'s. For this reason, 
we shall not include the large deviation in our simulation studies below. 
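
For standard Normal data the three approximations compared in this example can be evaluated in closed form. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses $w = \sqrt{-\log(1-b^2)}$ and $v = b\sqrt{1-b^2}$, which are derived here by solving the saddlepoint equations for N(0,1) data and are not stated in the text, in the formula $1 - \Phi(\sqrt{n}w) - \phi(\sqrt{n}w)n^{-1/2}(1/w - 1/v)$. It takes the large deviation value to be the exponential term $e^{-n\Lambda} = (1-b^2)^{n/2}$ alone, and the Normal approximation to be $1 - \Phi(\sqrt{n}\,b)$.

```python
import math


def normal_tail(z):
    """Upper tail 1 - Phi(z) of the standard Normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_pdf(z):
    """Standard Normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_approx(b, n):
    """Normal approximation 1 - Phi(sqrt(n) * b)."""
    return normal_tail(math.sqrt(n) * b)


def large_deviation(b, n):
    """Exponential term only: (1 - b^2)^(n / 2) for N(0, 1) data (assumed form)."""
    return (1.0 - b * b) ** (n / 2.0)


def saddlepoint(b, n):
    """Saddlepoint formula with w and v specialised to N(0, 1) data (assumed forms)."""
    w = math.sqrt(-math.log(1.0 - b * b))
    v = b * math.sqrt(1.0 - b * b)
    z = math.sqrt(n) * w
    return normal_tail(z) - normal_pdf(z) / math.sqrt(n) * (1.0 / w - 1.0 / v)


for b in (0.05, 0.50, 0.95):
    print(b, round(saddlepoint(b, 5), 4), round(large_deviation(b, 5), 4),
          round(normal_approx(b, 5), 4))
```

With n = 5, the values at the three thresholds checked here (b = 0.05, 0.50, 0.95) agree with the Saddle, L.D. and N.A. columns of Table 1 to roughly the precision printed there. The large deviation value is exactly the factor that the saddlepoint formula multiplies by lower-order terms, which is why it overstates the probability so badly on its own.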

To see why the large deviation performs so poorly, we note that Theorem 2.1
gives the limit of $P(\bar X/\sqrt{\bar Y} \ge b)^{1/n}$ as $n \to \infty$. However,
$[C_n P(\bar X/\sqrt{\bar Y} \ge b)]^{1/n}$ would give the same limit as long as
$C_n^{1/n} \to 1$. That is, the large deviation only captures the exponential
component and any other terms are simply thrown away.

Table 1
$f(x) = (2\pi)^{-1/2} e^{-x^2/2}$ (Normal density)

  b      True     Saddle   (R.E.)     L.D.     (R.E.)    N.A.     (R.E.)
 0.05   0.4621   0.4621   (0.0001)   0.9938   (1.15)    0.4555   (0.01)
 0.10   0.4243   0.4244   (0.0003)   0.9752   (1.30)    0.4115   (0.03)
 0.15   0.3869   0.3872   (0.0007)   0.9447   (1.44)    0.3687   (0.05)
 0.20   0.3500   0.3505   (0.001)    0.9030   (1.58)    0.3274   (0.06)
 0.25   0.3138   0.3146   (0.003)    0.8510   (1.71)    0.2881   (0.08)
 0.30   0.2785   0.2797   (0.004)    0.7900   (1.84)    0.2512   (0.10)
 0.35   0.2443   0.2460   (0.007)    0.7213   (1.95)    0.2169   (0.11)
 0.40   0.2113   0.2136   (0.01)     0.6467   (2.06)    0.1855   (0.12)
 0.45   0.1799   0.1829   (0.02)     0.5680   (2.16)    0.1572   (0.13)
 0.50   0.1502   0.1539   (0.02)     0.4871   (2.24)    0.1318   (0.12)
 0.55   0.1225   0.1268   (0.04)     0.4063   (2.32)    0.1094   (0.11)
 0.60   0.0970   0.1019   (0.05)     0.3277   (2.38)    0.0899   (0.07)
 0.65   0.0739   0.0793   (0.07)     0.2534   (2.42)    0.0731   (0.01)
 0.70   0.0536   0.0592   (0.10)     0.1857   (2.46)    0.0588   (0.10)
 0.75   0.0363   0.0417   (0.15)     0.1266   (2.49)    0.0468   (0.29)
 0.80   0.0223   0.0271   (0.22)     0.0778   (2.49)    0.0368   (0.65)
 0.85   0.0116   0.0154   (0.33)     0.0406   (2.49)    0.0287   (1.46)
 0.90   0.0045   0.0070   (0.53)     0.0157   (2.46)    0.0221   (3.86)
 0.95   0.0009   0.0018   (1.03)     0.0030   (2.42)    0.0168   (18.4)

Fig. 1. Comparisons under the Normal density. [Panel: comparison of relative errors against b; curves: Saddle, L.D., N.A.]

In a way, the relationship between the large deviation and the saddlepoint
approximation is a little like that between the Normal approximation and
the Edgeworth expansion, since in both cases, the former provides the dominant
term for the latter. One major difference is the following. The Normal
approximation can be used in statistical inference when the sample size is
reasonably large and the Edgeworth expansion can often provide more accurate
approximations than the Normal approximation. However, one cannot
usually rely on large deviation probability to calculate tail probabilities
in general since the approximations are often too crude to be useful, as
shown in the last example. By contrast, the saddlepoint method can provide
extremely accurate approximations throughout the range.

5.2. Saddlepoint approximations for light tailed distributions. Here, we
study the accuracy of the saddlepoint approximation to $P(\bar X/\sqrt{\bar Y} \ge b)$ when
the underlying distribution has thin tails. Let $X_1, \ldots, X_n$ be a random sample
from the centered exponential density with p.d.f.

$$f(x) = e^{-(x+1)}, \qquad x > -1.$$

The tail of the density decreases exponentially fast (but not as fast as the 
Normal density function). As mentioned before, even for this "nice" density, 
the stringent exponential moment condition given by Daniels and Young 
(1991) is not satisfied. But the saddlepoint approximation still holds from 
Theorem 4.1. The Normal approximation and the Edgeworth expansion are 
included for comparison. The results are presented in Table 2 and Figure 2. 
We make the following observations. 

(i) The saddlepoint approximation is remarkably accurate and uniformly 
better than the other approximation methods. Most of the relative errors 
fall below 10%, and the maximum error is only 17% near the center of the 
distribution. 

(ii) The Edgeworth expansion performs better than the Normal approx- 
imation throughout the whole range. Both give reasonable approximations 
at the center, but they turn very bad toward the tail areas, where the rela- 
tive errors are of the order of 1000% for tail area probabilities in the order 
of 1%. By comparison, the errors for the saddlepoint approximation do not 
exceed 20% for the whole region. 

(iii) This example clearly demonstrates why accurate approximations of 
the tail area probabilities are important in statistical inference. It is easy to 
conceive of a hypothesis test such that its p-value is given by $P_{H_0}(\bar X/\sqrt{\bar Y} \ge
0.75)$, where the $X_i$'s follow a centered exponential distribution under $H_0$.
From Table 2, the true value is 0.0088 < 0.01, which leads to the rejection of
$H_0$ at significance level 1%. The same conclusion would be reached by using
the saddlepoint approximation, but not by using the Normal approximation 
or the Edgeworth expansion. 
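
The test decision described in (iii) can be replayed directly from the b = 0.75 row of Table 2. The snippet is a small illustration; the dictionary and variable names are chosen here and are not from the paper.

```python
# Values read from the b = 0.75 row of Table 2.
p_values = {
    "True": 0.0088,
    "Saddle": 0.0085,
    "Normal": 0.0668,
    "Edgeworth": 0.0441,
}
ALPHA = 0.01  # significance level used in the text

# Reject H0 when the (approximate) p-value falls below the significance level.
decisions = {name: p < ALPHA for name, p in p_values.items()}
for name, reject in decisions.items():
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"{name:10s} p = {p_values[name]:.4f} -> {verdict}")
```

Only the saddlepoint approximation reproduces the decision obtained from the true probability; the Normal approximation and the Edgeworth expansion overstate the p-value by factors of roughly 7.6 and 5.0.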
5.3. Saddlepoint approximations for heavy tailed distributions. Here we
are interested in the accuracy of the saddlepoint approximation for
self-normalized sums when the underlying distribution has heavy tails. We shall
give two examples.

Example 5.1. Let $X_1, \ldots, X_n$ be a random sample from the $t_2$ distribution
with p.d.f.

$$f(x) = \frac{1}{2^{3/2}(1 + x^2/2)^{3/2}}.$$

Clearly, $EX_1 = 0$ and $\mathrm{Var}(X_1) = \infty$. Also, it is easy to check that $X_1$ is in
the domain of attraction of the Normal law. It then follows from Giné, Götze
and Mason (1997) that the Student's t-statistic is asymptotically $N(0,1)$.
Clearly, the saddlepoint approximation still holds under this heavy tail
distribution, following Theorem 4.1. So, in this case, we can compare the
saddlepoint approximation with the Normal approximation of the Student's
t-statistic. The results are summarized in Table 3 and Figure 3.



Table 2
$f(x) = e^{-(x+1)}$, $x > -1$ (centered exponential density)

  b      True     Saddle    (R.E.)    Normal   (R.E.)    Edgeworth  (R.E.)
 0.05   0.4231   0.4951    (0.170)   0.4602   (0.09)    0.4024     (0.05)
 0.10   0.3869   0.4267    (0.103)   0.4207   (0.09)    0.3611     (0.07)
 0.15   0.3487   0.3486    (0.000)   0.3821   (0.10)    0.3197     (0.08)
 0.20   0.3090   0.3046    (0.001)   0.3446   (0.12)    0.2792     (0.10)
 0.25   0.2680   0.2633    (0.018)   0.3085   (0.15)    0.2407     (0.10)
 0.30   0.2270   0.2223    (0.021)   0.2743   (0.21)    0.2052     (0.10)
 0.35   0.1866   0.1825    (0.022)   0.2420   (0.20)    0.1732     (0.07)
 0.40   0.1486   0.1451    (0.023)   0.2119   (0.43)    0.1452     (0.02)
 0.45   0.1141   0.1114    (0.024)   0.1841   (0.61)    0.1214     (0.06)
 0.50   0.0840   0.0822    (0.022)   0.1587   (0.89)    0.1015     (0.21)
 0.55   0.0594   0.0581    (0.023)   0.1357   (1.28)    0.0851     (0.43)
 0.60   0.0402   0.0391    (0.028)   0.1151   (1.86)    0.0717     (0.78)
 0.65   0.0256   0.0250    (0.026)   0.0968   (2.77)    0.0608     (1.37)
 0.70   0.0156   0.0151    (0.033)   0.0808   (4.19)    0.0517     (2.32)
 0.75   0.0088   0.0085    (0.039)   0.0668   (6.59)    0.0441     (4.01)
 0.80   0.0045   0.0044    (0.029)   0.0548   (11.15)   0.0376     (7.34)
 0.85   0.0021   0.0020    (0.031)   0.0446   (20.43)   0.0319     (14.34)
 0.90   0.0008   0.00075   (0.064)   0.0359   (43.86)   0.0269     (32.57)

Fig. 2. Comparisons under exponential density. [Panels: comparison of probabilities; comparison of relative errors. Curves: Saddle, N.A., Edgeworth.]

Example 5.2. Let $X_1, \ldots, X_n$ be a random sample from the Cauchy
distribution with p.d.f.

$$f(x) = \frac{1}{\pi(1 + x^2)}.$$

Note that the usual Normal approximation and Edgeworth expansion do not
hold here. However, the saddlepoint approximation continues to hold.
The results are given in Table 4 and Figure 4.

We make some remarks about the two examples. 

(i) Clearly, the saddlepoint approximation is remarkably accurate even 
for these rather heavy tail distributions. The relative errors remain very 
small (under 11% and 13%, resp.) for the range considered. 

(ii) For the $t_2$ density case, asymptotic normality holds and the Normal
approximation performs rather well in the center, but it becomes very poor
toward the tail area. In fact, the relative errors start to shoot up just as the
tail probability decreases to around 5% and beyond, which is the area of
most interest in statistical inference. The plot of relative errors in Figure 3
should dispel any doubt.

Table 3
$f(x) = 2^{-3/2}(1 + x^2/2)^{-3/2}$ ($t_2$ density)

  b      True     Saddle   (R.E.)    N.A.     (R.E.)
 0.40   0.2386   0.2637   (0.105)   0.2119   (0.11)
 0.45   0.1987   0.2146   (0.080)   0.1841   (0.07)
 0.50   0.1598   0.1708   (0.069)   0.1587   (0.01)
 0.55   0.1255   0.1322   (0.053)   0.1357   (0.08)
 0.60   0.0953   0.0990   (0.040)   0.1151   (0.21)
 0.65   0.0694   0.0713   (0.027)   0.0968   (0.39)
 0.70   0.0479   0.0488   (0.019)   0.0808   (0.69)
 0.75   0.0310   0.0312   (0.007)   0.0668   (1.15)
 0.80   0.0183   0.0182   (0.006)   0.0548   (2.00)
 0.85   0.0094   0.0093   (0.019)   0.0446   (3.72)
 0.90   0.0038   0.0036   (0.056)   0.0359   (8.34)

Table 4
$f(x) = \pi^{-1}(1 + x^2)^{-1}$ (Cauchy density)

  b      True     Saddle   (R.E.)
 0.40   0.2712   0.3058   (0.13)
 0.45   0.2085   0.2302   (0.10)
 0.50   0.1515   0.1697   (0.12)
 0.55   0.1117   0.1218   (0.09)
 0.60   0.0798   0.0845   (0.06)
 0.65   0.0537   0.0563   (0.05)
 0.70   0.0344   0.0356   (0.04)
 0.75   0.0207   0.0210   (0.02)
 0.80   0.0112   0.0113   (0.01)
 0.85   0.0052   0.0052   (0.00)
 0.90   0.0019   0.0019   (0.02)

Fig. 3. Comparisons under the $t_2$ density. [Panel: comparison of relative errors against b; curves: Saddle, N.A.]

(iii) We have seen that the saddlepoint approximation provides extremely
accurate approximation of the distribution of the self-normalized sum or,
equivalently, of the Student's t-statistic, particularly near the tail area. It
is also clear that the tail probability of the Student's t-statistic decreases
exponentially fast. These properties hold irrespective of whether the underlying
density has light or heavy tails. These results confirm the common
belief that the Student's t-statistic provides a very robust procedure for the
statistical inference of a population mean with a possible heavy-tailed
distribution. On the other hand, it is well known that the sample mean is very
sensitive to outliers and is not robust against heavy-tailed distributions.

Fig. 4. Comparisons under the Cauchy density. [Panel: relative error of saddlepoint approximation against b; curve: Saddle.]

(iv) Robustness of the self-normalized sums or, equivalently, the Student's
t-statistic, can also be explained intuitively as follows. It is well known that
when there is an outlier on the right-hand side among the observations
$X_1, \ldots, X_n$, the sample mean, $\bar X$, is dominated by the largest order statistic,
$X_{(n)} = \max\{X_1, \ldots, X_n\}$. For self-normalized sums, $\bar X/\sqrt{\bar Y}$, both $\bar X$ and
$\sqrt{\bar Y}$ are dominated by $X_{(n)}$, effectively cancelling the influence of any outlier.

5.4. Summary. The Student's t-statistic is one of the most commonly 
used statistics in inference. We have derived a saddlepoint approximation 
for the Student's t-statistic under no moment condition. The key results are 
summarized as follows. 

1. The saddlepoint approximation provides extremely accurate approxima- 
tions to the distribution of the Student's t-statistic. The approximation 
is particularly useful in calculating small probabilities in the tail areas, 
which are often of great interest in practice. 

2. The saddlepoint approximation holds under no moment condition. This 
makes the application of the saddlepoint approximation very broad. This 
is significant for the user since one can use the approximation without 
having to worry about whether or not the result is valid. 

3. The Student's t-statistic is very robust against possible outliers. 

For those reasons, the saddlepoint approximation of the Student's t-statistic 
should always be used in practice whenever possible. 

6. Proof of Theorem 4.1. The immediate consequence of condition (C1)
is as follows.

Proposition 6.1. $F(x)$ is a continuous d.f. under condition (C1) of
Theorem 3.1.

Proof. Let $2u$ be the smallest even integer not less than $\nu$. Then

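$$E\exp\{is(X_1+\cdots+X_u-X_{u+1}-\cdots-X_{2u}) + it(X_1^2+\cdots+X_u^2-X_{u+1}^2-\cdots-X_{2u}^2)\} = |Ee^{isX+itX^2}|^{2u}$$
is integrable in $(s,t)$.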

By the Fourier inversion theorem [e.g., see (7.14) of Feller (1971)],
$(X_1+\cdots+X_u-X_{u+1}-\cdots-X_{2u},\ X_1^2+\cdots+X_u^2-X_{u+1}^2-\cdots-X_{2u}^2)$ has
a bounded continuous density, which implies that $F(x)$ is a continuous d.f. □

The key to getting rid of condition (C2) is the following. 

Proposition 6.2. Assume that $F(x)$ is a continuous d.f. Then for each
fixed $b \in (0,1)$, $\inf_{a>0}\Lambda(a,b)$ is attained at some finite unique point, $a_0 := a_0(b)$, which is the solution to $\Lambda_a(a,b) = 0$.

Proof. The proof follows from Lemmas A.6 and A.7. □

In an effort to remove condition (C3), we shall give the following two 
propositions. 

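Proposition 6.3. Under the conditions of Theorem 2.1, we have
$$\lim_{n\to\infty} P(\bar X/V_n \ge b)^{1/n} = \sup_{a\ge 0}\inf_{t\ge 0} E\exp\{t(aX - b(X^2+a^2)/2)\} = \exp\Bigl\{-\inf_{a>0}\Lambda(a,b)\Bigr\}.$$

Proof. The first equality follows from Theorem 1.1 of Shao (1997). The
second one follows since
$$\log\Bigl(\sup_{a\ge 0}\inf_{t\ge 0} E\exp\{t(aX - b(X^2+a^2)/2)\}\Bigr)$$
$$= -\inf_{a\ge 0}\sup_{t\ge 0}\Bigl(\tfrac{1}{2}tba^2 - \log E\exp\{t(aX - bX^2/2)\}\Bigr)$$
$$= -\inf_{a\ge 0}\sup_{t_1\le 0}\Bigl(-t_1a^2 - \log E\exp\Bigl\{t_1\Bigl(-\frac{2a}{b}X + X^2\Bigr)\Bigr\}\Bigr) \quad (\text{where } t_1 = -tb/2)$$
$$= -\inf_{a_1\ge 0}\sup_{t_1\le 0}\Bigl(-\frac{t_1a_1^2}{b^2} - K\Bigl(-\frac{2a_1t_1}{b^2}, t_1\Bigr)\Bigr) \quad (\text{where } a = a_1/b)$$
$$= -\inf_{a>0}\Lambda(a,b) \quad [\text{by (A.2) and Lemma A.7}].$$
□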

The proposition establishes the relationship between the saddlepoint ap- 
proximation formula of Theorem 3.1 and the large deviation results of Theo- 
rem 2.1. It shows that the dominant term in the saddlepoint approximation 
given in Theorem 3.1 is the same as that in the large deviation of Shao 
(1997). Since the latter requires no moment conditions at all, it is therefore 
reasonable to expect that Theorem 3.1 holds under no moment conditions as 
well. Unfortunately, the techniques used in Shao (1997) cannot be employed 
here for our purposes. One crucial result is the following. 

Proposition 6.4. Assume that $F(x)$ is a continuous d.f. Then, for
$0 < b < 1$, there exist solutions $(s_0, t_0, a_0)$ in (3.6) such that $s_0 > 0$, $t_0 < 0$
and $a_0 > 0$.

Proof. The proof follows straightaway from Lemmas A.3, A.6 and Remark 3.1. □

The critical observation here is that $t_0 < 0$, which implies that the cumulant generating function, $K(s,t) = \ln Ee^{sX+tX^2}$, always exists for $(s,t)^T$ in
a small neighborhood of $(s_0,t_0)^T$ by the continuity of $K(s,t)$. This suggests
that, in order to derive self-normalized saddlepoint approximations without
moment conditions, we need to divide the probability, $P(\bar X/V_n > b)$, into
two regions:

(i) a small neighborhood of $(s_0,t_0)^T$ for which $t_0 < 0$, where we need to
show that there exists a saddlepoint approximation without any moment
conditions;

(ii) the remaining region outside this small neighborhood of $(s_0,t_0)^T$,
where we need to show that the probability is "negligible."
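
Why $t < 0$ removes the need for moment conditions can be checked on the standard Cauchy distribution, which has no finite moments. The following is a rough sketch rather than anything from the paper: the function name `cgf_cauchy` and the quadrature scheme are assumptions made for illustration. For $t<0$ the integrand $e^{sx+tx^2}$ is bounded by $e^{-s^2/(4t)}$, so $K(s,t)$ is finite and at most $-s^2/(4t)$.

```python
import math

def cgf_cauchy(s, t, m=100_000):
    """Approximate K(s,t) = ln E exp(sX + tX^2) for a standard Cauchy X, t < 0.

    Substituting x = tan(theta) turns dF(x) into d(theta)/pi on (-pi/2, pi/2);
    the integral is then approximated by the midpoint rule on m cells.
    """
    if t >= 0:
        raise ValueError("this sketch only covers t < 0")
    h = math.pi / m
    total = 0.0
    for i in range(m):
        x = math.tan(-math.pi / 2 + (i + 0.5) * h)
        total += math.exp(s * x + t * x * x)  # bounded by exp(-s^2 / (4t))
    return math.log(total * h / math.pi)

for s, t in ((0.0, -1.0), (1.0, -0.5), (3.0, -0.1)):
    print(f"s={s}, t={t}: K(s,t) ~ {cgf_cauchy(s, t):.6f}, "
          f"upper bound {-s * s / (4 * t):.6f}")
```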

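To make these statements precise, define
$$\Omega(b) = \{(x,y)^T \mid b < x/\sqrt{y} < 1\},$$
$$\Omega_0(b) = \{(x,y)^T \mid (x-a_0)^2 + (y-a_0^2/b^2)^2 < \varepsilon^2\} \cap \Omega(b),$$
$$\Omega_1(b) = \Omega(b) \setminus \Omega_0(b).$$

The closure of an arbitrary set, $A$, will be denoted as $A^-$. The plots of these
regions are illustrated in Figure 5.
Hence, for any $0 < b < 1$,
$$P(\bar X > bV_n) = \iint_{\Omega(b)} f_{(\bar X,\bar Y)}(x,y)\,dx\,dy = \iint_{\Omega_0(b)} f_{(\bar X,\bar Y)}(x,y)\,dx\,dy + P((\bar X,\bar Y)^T \in \Omega_1(b)) := J_1(b) + J_2(b). \tag{6.1}$$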

Thus, the proof of Theorem 4.1 follows from the next two propositions.

Proposition 6.5. Under the conditions of Theorem 4.1 we have
$$J_1(b) = 1 - \Phi(\sqrt{n}\,w) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\Bigl(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\Bigr), \tag{6.2}$$
where $w$ and $v$ are defined the same as in Theorem 3.1.



Proof. Denote $h_1(s,t;a,b) = K_s(s,t) - a$, $h_2(s,t;a,b) = K_t(s,t) - a^2/b^2$.
Since $h_1(-2a_0t_0/b^2, t_0; a_0, b) = 0$, $h_2(-2a_0t_0/b^2, t_0; a_0, b) = 0$ and
$$\begin{pmatrix} \partial h_1/\partial s & \partial h_2/\partial s \\ \partial h_1/\partial t & \partial h_2/\partial t \end{pmatrix}\Bigg|_{(s,t,a) = (-(2a_0/b^2)t_0,\ t_0,\ a_0)}$$
is positive definite, it follows from the implicit function theorem that there
exists $\varepsilon > 0$ such that $s_1 = s(a,b_1)$ and $t_1 = t(a,b_1)$ are differentiable functions of $a$ and $b_1$ when $(a, a^2/b_1^2)^T \in \Omega_0(b)$ for any $0 < b < 1$, where $s_1$ and
$t_1$ are solutions to the equations $K_s(s,t) = a$, $K_t(s,t) = a^2/b_1^2$. Since $t_0 < 0$,
we can always choose $\varepsilon$ to be so small that $t_1 < 0$.

Fig. 5. Partition of the area of integration.

Using the transformation (3.5) and the saddlepoint approximation for 



Ji(6) 



no(fe) V n 



(6.3) 



(a,a2/b2)T£QQ(fe) 



/„(x(a, &), 2/(a, b)) dxdy[\ + OirT^)) 
n exp{— nA(a, &)} ^, , „ ,^ „, 



/b2)TgQo(fe) 27r det{A(a, 6)}^ 

|-ao+<52 exp{-nA(a,6)} 
2^det{A(a,6)}i/2 



J(a,6) dad6(l + 0(n~^)) 
where $|r_n| < C$ since $\Omega_0(b)$ is compact and $\delta_1$ and $\delta_2$ are small positive
numbers such that

if $a \in [a_0 - \delta_2, a_0 + \delta_2]$ and $b_1 \in [b, b + \delta_1]$, then $(a, a^2/b_1^2)^T \in \Omega_0(b)$.

By Proposition 6.2, applying the Laplace approximation to the inner integral 
of (6.3) w.r.t. a gives 



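$$J_1(b) = \int_b^{b+\delta_1} \sqrt{\frac{n}{2\pi}}\, \frac{\exp\{-n\Lambda(a_0(b),b)\}\, J(a_0(b),b)}{\det\{\Delta(a_0(b),b)\}^{1/2}\, \Lambda_{aa}^{1/2}(a_0(b),b)} \Bigl(1 + \frac{r_{1n}}{n}\Bigr)\, db\, (1 + O(n^{-1}))$$
$$= \int_b^{b+\delta_1} \sqrt{\frac{n}{2\pi}}\, \frac{\exp\{-n\Lambda(a_0(b),b)\}\, J(a_0(b),b)}{\det\{\Delta(a_0(b),b)\}^{1/2}\, \Lambda_{aa}^{1/2}(a_0(b),b)}\, db\, (1 + O(n^{-1})), \tag{6.4}$$

where $|r_{1n}|$ is uniformly bounded in $[b, b + \delta_1]$.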

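From Lemma A.8, $\Lambda(a_0(b),b)$ is a strictly increasing function of $b$ in the
neighborhood of $b$. Define
$$w = w(b) = \sqrt{2\Lambda(a_0(b),b)},$$
$$v = v(b) = \frac{\det\{\Delta(a_0,b)\}^{1/2}\, \Lambda_b(a_0,b)\, \Lambda_{aa}^{1/2}(a_0,b)}{J(a_0,b)}.$$

Noting that
$$\frac{dw}{db} = \frac{1}{w}\Bigl(\Lambda_b(a_0,b) + \Lambda_a(a_0,b)\frac{da_0(b)}{db}\Bigr) = \frac{\Lambda_b(a_0,b)}{w},$$

we have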

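$$J_1(b) = \int_w^{w_1} \sqrt{\frac{n}{2\pi}}\, \frac{w}{v}\, e^{-nw^2/2}\, dw\, (1 + O(n^{-1})), \tag{6.5}$$

where $w = w(b)$ and $w_1 = w(b + \delta_1)$. Write $v = v(b)$. Applying the Laplace
approximation to the second integral of the following equality, we get
$$J_1(b) = \int_w^{w_1} \sqrt{\frac{n}{2\pi}}\, \frac{w}{v}\, e^{-nw^2/2}\, dw\, (1 + O(n^{-1}))$$
$$= \Phi(\sqrt{n}\,w_1) - \Phi(\sqrt{n}\,w) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\Bigl(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\Bigr) \tag{6.6}$$
$$= (1 - \Phi(\sqrt{n}\,w))(1 + O(n^{-1})) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\Bigl(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\Bigr)$$
$$= 1 - \Phi(\sqrt{n}\,w) - \frac{\phi(\sqrt{n}\,w)}{\sqrt{n}}\Bigl(\frac{1}{w} - \frac{1}{v} + O(n^{-1})\Bigr),$$

where, in going from the second-to-the-last to the last line, we used $1 - \Phi(x) \sim \phi(x)/x$ as $x \to \infty$. Replacing $b_1$ by $b$, we get the desired result. □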
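
The leading terms of (6.2) are straightforward to evaluate once $w$ and $v$ are available. The sketch below is illustrative only: it takes $w$ and $v$ as given inputs (the values used are hypothetical, since computing them requires the quantities of Theorem 3.1), drops the $O(n^{-1})$ remainder, and also checks the tail relation $1-\Phi(x)\sim\phi(x)/x$ used in the last step of the proof. The function names are assumptions made for this example.

```python
import math

def norm_pdf(x):
    """Standard Normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_sf(x):
    """Upper tail 1 - Phi(x), via erfc for accuracy far in the tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_approx(n, w, v):
    """Leading terms of (6.2), with the O(1/n) remainder dropped."""
    z = math.sqrt(n) * w
    return norm_sf(z) - norm_pdf(z) / math.sqrt(n) * (1.0 / w - 1.0 / v)

# Hypothetical values of w and v, for illustration only.
print(tail_approx(n=10, w=0.4, v=0.5))

# Tail relation used in the proof: the ratio below approaches 1 as x grows.
for x in (2.0, 5.0, 10.0):
    print(x, norm_sf(x) * x / norm_pdf(x))
```

When $v = w$ the correction term vanishes and the expression reduces to the plain Normal tail $1-\Phi(\sqrt{n}\,w)$.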

Proposition 6.6. Under the conditions of Theorem 4.1,
$$J_2(b)/J_1(b) = o(n^{-m}) \quad \text{for any } m > 0. \tag{6.7}$$

Proof. By Lemma A.8, $\Lambda(a_0(b),b)$ is a strictly increasing function of $b$
for $b \in (0,1)$. Therefore, applying Laplace approximations to (6.4) again, we
have
$$C_1 n^{-1/2}\exp[-n\Lambda(a_0(b),b)] \le J_1(b) \le C_2 n^{-1/2}\exp[-n\Lambda(a_0(b),b)],$$

where $0 < C_1 < C_2 < \infty$.
The proposition then follows from this and Lemma A.9. □

Finally, Theorem 4.1 follows from (6.1), (6.6) and (6.7).

APPENDIX: SOME USEFUL LEMMAS 

From here on, let $X$ be a r.v. with $EX = 0$ or $EX^2 = \infty$. We shall also
adopt the same notation from Section 3. Write
$$I(s,t;a,b) = sa + ta^2/b^2 - K(s,t).$$

We now give our first lemma.

Lemma A.1. For fixed $a$ and $b$, we have
$$\Lambda(a,b) = \sup_{s,t} I(s,t;a,b).$$
When no solutions to $\partial I(s,t;a,b)/\partial s = \partial I(s,t;a,b)/\partial t = 0$ exist, we define
$\Lambda(a,b) = \infty$.

Proof. It is easy to see that, for fixed $a$ and $b$, $I(s,t;a,b)$ is a concave
function of $s$ and $t$ and it is differentiable for any $(s,t)^T \in \operatorname{interior}(\Theta)$, where
$$\Theta = \{\theta = (s,t)^T : K(s,t) = \ln Ee^{sX+tX^2} < \infty\}.$$

Therefore,
$$\sup_{s,t} I(s,t;a,b) = \hat s a + \hat t a^2/b^2 - K(\hat s,\hat t) = \Lambda(a,b),$$
where $\hat s = \hat s(x,y)$ and $\hat t = \hat t(x,y)$ are solutions to
$$K_s(\hat s,\hat t) = a, \qquad K_t(\hat s,\hat t) = a^2/b^2, \tag{A.1}$$
whenever the solutions exist. When no solutions exist, then clearly we have
$\sup_{s,t} I(s,t;a,b) = \infty$. The proof is complete. □

From Theorem 3.1 and Lemma A.1, we see that the saddlepoint approximation of the self-normalized sum involves finding, for fixed $b$,
$$\Lambda(a_0,b) := \inf_a \Lambda(a,b) = \inf_a \sup_{s,t} I(s,t;a,b) = I(s_0,t_0;a_0,b),$$
where $s_0$, $t_0$ and $a_0$ satisfy (3.6). In particular, we notice that the point
$(s_0,t_0,a_0)^T$ falls on the curve $s_0 = -2a_0t_0/b^2$. This motivates the following
definition:
$$g(t,a;b) = I(s,t;a,b)\big|_{s=-2at/b^2} = -ta^2/b^2 - K(-2at/b^2, t). \tag{A.2}$$

Also note that the domain of $a$ in the above infimum can be reduced to
$\{a : ab > 0\}$ because of the transformation $a = x$ and $b = x/\sqrt{y}$. Since we
only consider the case $0 < b < 1$, from now on we can suppose $a > 0$.
Let $C_s$ denote the support of the r.v. $X$, that is,
$$C_s = \{x : P(X \in (x-\varepsilon, x+\varepsilon)) > 0 \text{ for any } \varepsilon > 0\}.$$

Clearly, $C_s$ must be closed. We further use $\mathrm{Card}(C_s)$ to denote the number
of elements in $C_s$ and define $\mathrm{Card}(C_s) = \infty$ if $C_s$ does not contain a finite
number of elements.



Lemma A.2. Assume $\mathrm{Card}(C_s) \ge 3$. Then $g(t,a;b)$ is strictly decreasing
in $t$ for $t \in (-\varepsilon_0, \infty)$ for some $\varepsilon_0 > 0$.

Proof. It suffices to show that $g(t,a;b)$ is strictly decreasing in $t$ on each of the following ranges:

(I) for $t \in [0,\infty)$, and
(II) for $t \in (-\varepsilon_0, 0]$ for some $\varepsilon_0 > 0$.

We shall prove (I) first. Let $Z = -2aX/b^2 + X^2$. For arbitrary $t$ and $t_1$ such
that $0 \le t < t_1$, we need to show that $g(t,a;b) > g(t_1,a;b)$. If $Ee^{t_1Z} = \infty$, then
$g(t_1,a;b) = -a^2t_1/b^2 - \ln Ee^{t_1Z} = -\infty$, in which case (I) follows straightaway. Now, assume that $Ee^{t_1Z} < \infty$ below, which implies that moments of
$X$ of all orders exist. Thus, $g(t,a;b)$ is differentiable in $t$ for $t \in (-\infty, t_1)$.
Taking derivatives gives
$$\frac{\partial g(t,a;b)}{\partial t} = -\frac{a^2}{b^2} - \frac{EZe^{tZ}}{Ee^{tZ}}. \tag{A.3}$$
Observe that
$$\frac{\partial g(t,a;b)}{\partial t}\bigg|_{t=0} = -\frac{a^2}{b^2} - EX^2 < 0$$
and
$$\frac{\partial^2 g(t,a;b)}{\partial t^2} = -\biggl\{\frac{EZ^2e^{tZ}}{Ee^{tZ}} - \biggl(\frac{EZe^{tZ}}{Ee^{tZ}}\biggr)^2\biggr\} < 0 \tag{A.4}$$
since $Z = X^2 - 2aX/b^2$ is nondegenerate by the assumption that $\mathrm{Card}(C_s) \ge 3$. Thus, $\partial g(t,a;b)/\partial t < 0$ when $t \in [0,t_1)$. So $g(t,a;b)$ is strictly decreasing in
$[0,t_1)$. Since $t_1$ is arbitrary, we have hence proved (I).

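We shall prove (II) next. If there exists some $t_2 > 0$ such that $Ee^{t_2Z} < \infty$,
then (II) follows from the fact that $\partial g(t,a;b)/\partial t\,|_{t=0} = -a^2/b^2 - EX^2 < 0$. It remains
to prove (II) under the condition that
$$Ee^{t_2Z} = \infty \quad \text{for all } t_2 > 0.$$

To show this, we choose an arbitrary $t < 0$. Then, from (A.3) we have
$$\frac{\partial g(t,a;b)}{\partial t} = -\frac{a^2}{b^2} - \frac{\int_{-\infty}^{\infty}(-2ax/b^2 + x^2)\,e^{t(x-a/b^2)^2}\,dF(x)}{\int_{-\infty}^{\infty} e^{t(x-a/b^2)^2}\,dF(x)}. \tag{A.5}$$

By the monotone convergence theorem we have
$$\lim_{t\to 0^-}\int_{-\infty}^{\infty} e^{t(x-a/b^2)^2}\,dF(x) = 1,$$
$$\lim_{t\to 0^-}\int_{-\infty}^{\infty} x^2 e^{t(x-a/b^2)^2}\,dF(x) = EX^2 \ (\text{maybe } \infty), \tag{A.6}$$
where $t \to 0^-$ means that $t \to 0$ from the left side of 0.

If $E|X| < \infty$, then noting $|xe^{t(x-a/b^2)^2}| \le |x|$ for $t < 0$, we can use Lebesgue's
dominated convergence theorem to get
$$\lim_{t\to 0^-}\int_{-\infty}^{\infty}\Bigl(-\frac{2a}{b^2}x\Bigr)e^{t(x-a/b^2)^2}\,dF(x) = -\frac{2a}{b^2}EX = 0. \tag{A.7}$$

If $E|X| = \infty$ (hence $EX^2 = \infty$), then
$$\lim_{t\to 0^-}\int_{-\infty}^{\infty}\Bigl(-\frac{2a}{b^2}x + x^2\Bigr)e^{t(x-a/b^2)^2}\,dF(x) \ge \lim_{t\to 0^-}\int_{-\infty}^{\infty}\Bigl(-\frac{2a^2}{b^4} + \frac{x^2}{2}\Bigr)e^{t(x-a/b^2)^2}\,dF(x) = \infty. \tag{A.8}$$

Combining (A.5)-(A.8) gives
$$\lim_{t\to 0^-}\frac{\partial g(t,a;b)}{\partial t} < 0.$$

Note that $g(t,a;b)$ is left continuous at $t = 0$. We conclude (II). □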
Lemma A.3. Assume that $F(x)$ is a continuous d.f. For each fixed $b \in (0,1)$ and $a \in R$, we have
$$\sup_{t\in R} g(t,a;b) = \sup_{t<0} g(t,a;b), \tag{A.9}$$
and the supremum is either attained at some finite unique point, $\hat t := \hat t(a,b) < 0$, or is simply infinity.

Proof. Define $h(x) := x^2 - 2ax/b^2 + a^2/b^2 = (x-a_1)(x-a_2)$, where
$$a_{10} := a_{10}(a) = \frac{a}{b^2}\bigl(1 - \sqrt{1-b^2}\bigr), \qquad a_{20} := a_{20}(a) = \frac{a}{b^2}\bigl(1 + \sqrt{1-b^2}\bigr),$$
$$a_1 := a_1(a) = \min(a_{10}, a_{20}), \qquad a_2 := a_2(a) = \max(a_{10}, a_{20}). \tag{A.10}$$

Consider the following two cases:

(I') $(a_1,a_2) \cap C_s \ne \emptyset$,
(II') $(a_1,a_2) \cap C_s = \emptyset$.

First suppose that (I') holds. Then there must exist $W := [a_3,a_4] \subset (a_1,a_2)$
so that:

(i) there exists $\delta > 0$ such that $h(x) < -\delta$ for each $x \in W$;

(ii) $P(X \in W) > 0$.

Then we have, as $t \to -\infty$,
$$g(t,a;b) = -\ln\int_{-\infty}^{\infty} e^{th(x)}\,dF(x) \le -\ln\int_W e^{th(x)}\,dF(x) \le -\ln\int_W e^{-t\delta}\,dF(x) = t\delta - \ln P(X \in W) \to -\infty.$$

From Lemma A.2, $\sup_{t\in R} g(t,a;b)$ is attained at some finite $\hat t = \hat t(a,b) < 0$.
Since $g(t,a;b)$ is a differentiable function of $t$ when $t < 0$, we have $\partial g(\hat t,a;b)/\partial t = 0$.
This, together with (A.4), implies that there is at most one solution to the
equation $\partial g(t,a;b)/\partial t = 0$. Therefore, $\hat t$ is also unique.

Next suppose (II') holds. Since $C_s$ is necessarily closed, then $[a_1,a_2] \cap C_s$
contains at most two points, $\{a_1,a_2\}$. Clearly, we have:

(i) $h(x) > 0$ for each $x \in C_s \setminus \{a_1,a_2\}$;

(ii) $P(X \in C_s \setminus \{a_1,a_2\}) > 0$,

where (ii) follows since $F(x)$ is a continuous d.f. Therefore, as $t \to -\infty$, we
have
$$g(t,a;b) = -\ln\int_{-\infty}^{\infty} e^{th(x)}\,dF(x) \to \infty. \qquad □$$

Remark A.1. Lemma A.3 also holds true for $b \ge 1$, in which case both
sides of (A.9) are equal to infinity.
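
The factorization $h(x) = (x-a_1)(x-a_2)$ with the roots of (A.10) can be checked directly: the roots sum to $2a/b^2$ and multiply to $a^2/b^2$, and $h$ is negative exactly between them. The snippet below is a small sanity check under illustrative values of $a$ and $b$; the function names are chosen for this example only.

```python
import math

def roots_of_h(a, b):
    """Roots a1 <= a2 of h(x) = x^2 - 2 a x / b^2 + a^2 / b^2, as in (A.10)."""
    r = math.sqrt(1.0 - b * b)
    a10 = a / b**2 * (1.0 - r)
    a20 = a / b**2 * (1.0 + r)
    return min(a10, a20), max(a10, a20)

def h(x, a, b):
    """The quadratic h(x) from the proof of Lemma A.3."""
    return x * x - 2.0 * a * x / b**2 + a * a / b**2

a, b = 1.5, 0.6  # illustrative values with 0 < b < 1
a1, a2 = roots_of_h(a, b)
print("roots:", a1, a2)
print("h at roots:", h(a1, a, b), h(a2, a, b))
print("h at midpoint (negative):", h(0.5 * (a1 + a2), a, b))
```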

Lemma A.4. For $0 < b < 1$, define
$$U = \{a : (a_1(a), a_2(a)) \cap C_s \ne \emptyset\},$$
where $a_1(a)$ and $a_2(a)$ are defined in (A.10). Then, if $F(x)$ is a continuous
d.f.:

1. $U$ is a nonempty open set, and so is $U \cap R^+$, where $R^+ = \{x : x > 0\}$.

2. When $a \in U$, then $g(\hat t(a,b),a;b) = \sup_{t<0} g(t,a;b) < \infty$, where $\hat t = \hat t(a,b) < 0$
is a finite unique solution to the equation $\partial g(t,a;b)/\partial t = 0$.

3. When $a \notin U$, then $\sup_{t<0} g(t,a;b) = \infty$.

4. $\inf_{a>0}\sup_{t\in R} g(t,a;b) = \inf_{a\in U\cap R^+}\sup_{t<0} g(t,a;b)$.

Proof. We only prove 1 since 2-4 follow easily from the proof of Lemma A.3.

First, the claim that $U \ne \emptyset$ can be easily seen from the fact that $\bigcup_a (a_1(a), a_2(a)) = R \setminus \{0\}$. Second, we shall show that $U$ is open, which is equivalent to showing that
the complement of $U$,
$$U_0 = \{a : (a_1(a), a_2(a)) \cap C_s = \emptyset\},$$
is a closed set. To show this, for any fixed $a' \in U_0$, we have $(a_1(a'), a_2(a')) \cap C_s = \emptyset$,
or $(a_1(a'), a_2(a')) \subset \bar C_s$, the complement of $C_s$. Let $V(a')$ be the largest
interval such that $(a_1(a'), a_2(a')) \subset V(a') \subset \bar C_s$. For simplicity, assume that
$a' > 0$ (the cases for $a' = 0$ and $a' < 0$ can be treated similarly). Since $\bar C_s$
is open, then $V(a')$ must be open as well. Write $V(a') = (c_0, d_0)$, where the
endpoints could be $\infty$ or $-\infty$. Write
$$a_c(a') := \frac{b^2 c_0}{1 - \sqrt{1-b^2}}, \qquad a_d(a') := d_0\bigl(1 - \sqrt{1-b^2}\bigr).$$

It is easy to see that the closed interval $[a_c(a'), a_d(a')]$ will be the largest
subset of $U_0$ including $a'$. Furthermore, for any $a' \ne a''$, the two intervals
$[a_c(a'), a_d(a')]$ and $[a_c(a''), a_d(a'')]$ either coincide or are nonoverlapping.
Therefore,
$$U_0 = \bigcup_{a' \in R} [a_c(a'), a_d(a')],$$
which is closed. The proof is complete. □
Lemma A.5. For $0 < b < 1$:

1. $\lim_{a\to\infty}\sup_{t<0} g(t,a;b) = \infty$, $\lim_{a\to 0^+}\sup_{t<0} g(t,a;b) = \infty$;

2. $\lim_{a\to\infty}\sup_{s\in R,t\in R} I(s,t;a,b) = \infty$, $\lim_{a\to 0^+}\sup_{s\in R,t\in R} I(s,t;a,b) = \infty$,

where $a \to 0^+$ means that $a$ goes to 0 from the right side.

Proof. Let $k$ be a positive number. Then
$$\sup_{t<0} g(t,a;b) \ge g\Bigl(-\frac{k}{a^2}, a; b\Bigr) = \frac{k}{b^2} - \ln\int_{-\infty}^{\infty}\exp\Bigl\{\frac{k}{a}\Bigl(\frac{2x}{b^2} - \frac{x^2}{a}\Bigr)\Bigr\}\,dF(x) := \frac{k}{b^2} - \ln M(a). \tag{A.11}$$

It follows from Lebesgue's dominated convergence theorem that
$$\lim_{a\to\infty} M(a) = 1, \qquad \lim_{a\to 0^+} M(a) = 0. \tag{A.12}$$

Combining (A.11) and (A.12) gives
$$\liminf_{a\to\infty}\sup_{t<0} g(t,a;b) \ge \frac{k}{b^2}, \qquad \liminf_{a\to 0^+}\sup_{t<0} g(t,a;b) = \infty.$$

Since $k$ can be arbitrarily large, we have proved 1.

From (A.2) we have $\sup_{s\in R,t\in R} I(s,t;a,b) \ge \sup_{t<0} g(t,a;b)$. This, together
with 1 above, implies that
$$\lim_{a\to\infty}\sup_{s\in R,t\in R} I(s,t;a,b) = \infty, \qquad \lim_{a\to 0^+}\sup_{s\in R,t\in R} I(s,t;a,b) = \infty,$$
which completes the proof of 2. □

Lemma A.6. Assume that $F(x)$ is a continuous d.f. and that $0 < b < 1$.
Then $\inf_{a>0}\sup_{t\in R} g(t,a;b)$ is attained at some finite unique point, $(a,t)^T = (a_0,t_0)^T$, where $a_0 > 0$, $t_0 := \hat t(a_0,b) < 0$ and they satisfy (3.7).

Proof. It follows from Lemmas A.3-A.5 that $\inf_{a>0}\sup_{t\in R} g(t,a;b)$ is
attained at some finite points $a_0 \in U$ and $t_0 := \hat t(a_0,b) < 0$. When $a \in U$, by
part 2 of Lemma A.4, we have
$$\frac{\partial g(\hat t,a;b)}{\partial t} = -\frac{a^2}{b^2} - \frac{EZe^{\hat tZ}}{Ee^{\hat tZ}} = 0, \quad \text{where } Z = -\frac{2a}{b^2}X + X^2. \tag{A.13}$$
By the assumption that $F(x)$ is a continuous d.f., which implies that $Z$
is nondegenerate, (A.4) is true. It then follows from the implicit function
theorem that $\hat t(a,b)$ is a differentiable function in some neighborhood $U^*(a)$
of $a$ (also a differentiable function in some neighborhood of $b$). We can
also guarantee that $U^*(a) \subset U$. Hence $\sup_{t\in R} g(t,a;b)$ is also a differentiable
function in some neighborhood of $a_0$. Thus $a_0$ satisfies the equation $\partial g(\hat t(a,b),a;b)/\partial a = 0$, that is,
$$EX\exp\Bigl\{\hat t\Bigl(-\frac{2a}{b^2}X + X^2\Bigr)\Bigr\} = aE\exp\Bigl\{\hat t\Bigl(-\frac{2a}{b^2}X + X^2\Bigr)\Bigr\}. \tag{A.14}$$

It follows from (A.13) and (A.14) that $a_0$ and $t_0$ are the solutions to the
equations
$$EZe^{tZ} = -\frac{a^2}{b^2}Ee^{tZ}, \qquad EXe^{tZ} = aEe^{tZ},$$
which are equivalent to (3.7) or (3.6).

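Now we show the uniqueness of $(a_0,t_0)^T$. Suppose $(a_0',t_0')^T$ is another
point such that $g(t_0',a_0';b) = \inf_{a>0}\sup_{t\in R} g(t,a;b)$. Note that
$$g(t,a;b) = -\log E\exp\Bigl\{t\Bigl(-\frac{2a}{b^2}X + X^2 + \frac{a^2}{b^2}\Bigr)\Bigr\}.$$

We must have
$$E\exp\Bigl\{t_0\Bigl(-\frac{2a_0}{b^2}X + X^2 + \frac{a_0^2}{b^2}\Bigr)\Bigr\} = \sup_{a>0} E\exp\Bigl\{t_0\Bigl(-\frac{2a}{b^2}X + X^2 + \frac{a^2}{b^2}\Bigr)\Bigr\} \ge E\exp\Bigl\{t_0\Bigl(-\frac{2a_0'}{b^2}X + X^2 + \frac{a_0'^2}{b^2}\Bigr)\Bigr\}. \tag{A.15}$$
If $t_0 \ne t_0'$, then
$$E\exp\Bigl\{t_0\Bigl(-\frac{2a_0'}{b^2}X + X^2 + \frac{a_0'^2}{b^2}\Bigr)\Bigr\} > \inf_t E\exp\Bigl\{t\Bigl(-\frac{2a_0'}{b^2}X + X^2 + \frac{a_0'^2}{b^2}\Bigr)\Bigr\} \tag{A.16}$$
by the fact that $E\exp\{t(-2aX/b^2 + X^2 + a^2/b^2)\}$ is a strictly convex function
of $t$ for each fixed $a$ and $-\frac{2a}{b^2}X + X^2 + \frac{a^2}{b^2}$ is not identically equal to 0.
Combining (A.15) and (A.16), we get
$$E\exp\Bigl\{t_0\Bigl(-\frac{2a_0}{b^2}X + X^2 + \frac{a_0^2}{b^2}\Bigr)\Bigr\} > E\exp\Bigl\{t_0'\Bigl(-\frac{2a_0'}{b^2}X + X^2 + \frac{a_0'^2}{b^2}\Bigr)\Bigr\},$$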
which contradicts our assumption. Hence $t_0 = t_0'$.

Next we show that $a_0 = a_0'$. Define $f(a,s) = E\exp\{s(-2X/(ab^2) + X^2/a^2 + 1/b^2)\}$. Note that $f(a,s)$ is a strictly convex function of $s$ for each fixed $a$.
Thus, we have
$$f(a_0,s_0) = f(a_0',s_0') = \sup_{a>0}\inf_{s<0} f(a,s),$$
where $s_0 = t_0a_0^2$ and $s_0' = t_0'a_0'^2$. Similar to the proof of $t_0 = t_0'$ above, we can
show that $s_0 = s_0'$. Hence $a_0 = a_0'$. This completes the proof of uniqueness.
□



The next lemma establishes the relationship between $I(s,t;a,b)$ and $g(t,a;b)$.

Lemma A.7. Assume that $F(x)$ is a continuous d.f. Then, for $0 < b < 1$,
$$\inf_{a>0}\sup_{t<0} g(t,a;b) = \inf_{a\ge 0}\sup_{t\le 0} g(t,a;b) = \inf_{a>0}\sup_{s\in R,t\in R} I(s,t;a,b) = \inf_{a>0}\Lambda(a,b).$$

Proof. The first equality holds since $g(t,a;b)$ is strictly decreasing as
$t \to 0^-$ by Lemma A.2, and $\sup_{t<0} g(t,0;b) = \infty$. We shall now prove the
second equality. From (A.2) we have
$$\inf_{a>0}\sup_{s\in R,t\in R} I(s,t;a,b) \ge \inf_{a>0}\sup_{t<0} g(t,a;b). \tag{A.17}$$

From 2 of Lemma A.5 we see that $\inf_{a>0}\sup_{s\in R,t\in R} I(s,t;a,b)$ is attained at
some finite $\tilde a > 0$. By Lemma A.6, $\inf_{a>0}\sup_{t<0} g(t,a;b)$ is also attained at
some $a_0 > 0$ and $t_0 < 0$ satisfying equation (3.7), namely,
$$K_s(-2a_0t_0/b^2, t_0) = a_0, \qquad K_t(-2a_0t_0/b^2, t_0) = a_0^2/b^2. \tag{A.18}$$

Therefore,
$$\inf_{a>0}\sup_{s\in R,t\in R} I(s,t;a,b) = \sup_{s\in R,t\in R} I(s,t;\tilde a,b)$$
$$\le \sup_{s\in R,t\in R} I(s,t;a_0,b)$$
$$= \sup_{s\in R,t\in R}\{sa_0 + ta_0^2/b^2 - K(s,t)\}$$
$$= \{sa_0 + ta_0^2/b^2 - K(s,t)\}\big|_{s=\hat s_0,\,t=\hat t_0} \quad [\text{where } K_s(\hat s_0,\hat t_0) = a_0,\ K_t(\hat s_0,\hat t_0) = a_0^2/b^2]$$
$$= \{sa_0 + ta_0^2/b^2 - K(s,t)\}\big|_{s=-2a_0t_0/b^2,\,t=t_0} \quad [\text{by (A.18)}]$$
$$= -t_0a_0^2/b^2 - K(-2a_0t_0/b^2, t_0)$$
$$= g(t_0,a_0;b)$$
$$= \inf_{a>0}\sup_{t<0} g(t,a;b).$$

The lemma thus follows from this and (A.17). □
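
As a concrete check of the inf-sup characterization, take $X \sim N(0,1)$, an assumption made purely for illustration (it satisfies $EX=0$ and has a continuous d.f.). Its joint cumulant generating function is the standard $K(s,t) = s^2/(2(1-2t)) - \tfrac12\ln(1-2t)$ for $t<1/2$, and solving the saddlepoint equations by hand gives $a_0 = b$, $t_0 = -b^2/(2(1-b^2))$ and $\inf_{a>0}\sup_{t<0} g(t,a;b) = -\tfrac12\ln(1-b^2)$. The sketch below recovers these values numerically with a nested golden-section search; the helper names and the search brackets are choices made for this example.

```python
import math

def g_normal(t, a, b):
    """g(t, a; b) of (A.2) for X ~ N(0, 1), using
    K(s, t) = s^2 / (2 (1 - 2t)) - 0.5 * ln(1 - 2t), valid for t < 1/2."""
    s = -2.0 * a * t / b**2
    k = s * s / (2.0 * (1.0 - 2.0 * t)) - 0.5 * math.log(1.0 - 2.0 * t)
    return -t * a * a / b**2 - k

def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    x = 0.5 * (lo + hi)
    return x, f(x)

def inf_sup_g(b, a_range=(0.05, 5.0), t_range=(-200.0, 0.0)):
    """Return (a0, t0, value) for inf over a of sup over t <= 0 of g_normal."""
    def sup_t(a):
        return golden_max(lambda t: g_normal(t, a, b), *t_range)
    # Minimize over a by maximizing the negated inner supremum.
    a0, _ = golden_max(lambda a: -sup_t(a)[1], *a_range)
    t0, val = sup_t(a0)
    return a0, t0, val

b = 0.5
a0, t0, val = inf_sup_g(b)
print("a0 =", a0, " t0 =", t0)
print("numerical inf-sup:", val, " closed form:", -0.5 * math.log(1.0 - b * b))
```

The brackets for $a$ and $t$ are wide enough for $b = 0.5$; other values of $b$ may need different brackets.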

Lemma A.8. Assume that $F(x)$ is a continuous d.f. Then, for $0 < b < 1$,
$\inf_{a>0}\sup_{t<0} g(t,a;b)$ is a strictly increasing function of $b$.

Proof. Regard $g(\hat t(a,b),a;b)$ as a joint function of $a$ and $b$. Then
$$\frac{\partial g(\hat t(a,b),a;b)}{\partial b}\bigg|_{a=a_0} = -\frac{2a_0^2\,\hat t(a_0,b)}{b^3} > 0,$$
that is, $g(\hat t(a_0,b),a_0;b)$ is a strictly increasing function of $b$
in a small neighborhood of $b$. If $b_1 < b_2$ and $b_1$ is sufficiently close to $b_2$, we
have
$$g(\hat t(a_1,b_1),a_1;b_1) \le g(\hat t(a_2,b_1),a_2;b_1) < g(\hat t(a_2,b_2),a_2;b_2), \tag{A.19}$$
where $a_1$ and $a_2$ satisfy $g(\hat t(a_1,b_1),a_1;b_1) = \inf_{a>0} g(\hat t(a,b_1),a;b_1)$ and $g(\hat t(a_2,b_2),a_2;b_2) = \inf_{a>0} g(\hat t(a,b_2),a;b_2)$, respectively. Lemma A.7 and Proposition 6.3 imply
that $\inf_{a>0}\sup_{t<0} g(t,a;b)$ is a nondecreasing function of $b$, which, combined
with (A.19) holding under the condition that $b_1 < b_2$ and $b_1$ is sufficiently
close to $b_2$, proves Lemma A.8. □

Lemma A.9. Assume that $F(x)$ is a continuous d.f. Then, for $0 < b < 1$,
$\varepsilon > 0$ and $m > 0$, we have
$$P((\bar X, V_n^2)^T \in (\Omega_1(b))^-)/\exp(-ng(t_0,a_0;b)) = o(n^{-m}).$$

Proof. From Corollary 1.1 of Dembo and Shao (1998), we get
$$\limsup_{n\to\infty}\frac{1}{n}\ln P((\bar X, V_n^2)^T \in (\Omega_1(b))^-) \le -\inf_{(a,a^2/b_1^2)^T \in (\Omega_1(b))^-}\ \sup_{s,t} I(s,t;a,b_1) =: -I_{\min},$$
if the condition (1.12) in Dembo and Shao (1998) holds, which is clearly true
since
$$\liminf_{y\to\infty,\ (x,y)^T \in (\Omega_1(b))^-}\frac{x^2}{y} = b^2 > 0.$$

Hence, for any $\delta_1 > 0$, there exists $n_1$ such that if $n > n_1$,
$$\frac{1}{n}\ln P((\bar X, V_n^2)^T \in (\Omega_1(b))^-) \le -I_{\min} + \frac{\delta_1}{2}. \tag{A.20}$$

From (A.2) we have $0 < \sup_{t<0} g(t,a;b_1) \le \sup_{s,t} I(s,t;a,b_1)$ for any $b_1$,
which implies that
$$-I_{\min} \le -\inf_{(a,a^2/b_1^2)^T \in (\Omega_1(b))^-}\ \sup_{t<0} g(t,a;b_1). \tag{A.21}$$

Define
$$\delta_1 = \inf_{(a,a^2/b_1^2)^T \in (\Omega_1(b))^-}\ \sup_{t<0} g(t,a;b_1) - g(t_0,a_0;b).$$

We shall now show that $\delta_1 > 0$. Similar to Lemma A.5, we can show that if
$b \le b' \le 1$, then
$$\lim_{a\to 0^+,\ b_1\to b'} g(\hat t,a;b_1) = \infty, \qquad \lim_{a\to\infty,\ b_1\to b'} g(\hat t,a;b_1) = \infty.$$

Hence, $\inf_{(a,a^2/b_1^2)^T \in (\Omega_1(b))^-} g(\hat t,a;b_1)$ is attained at some finite $a_\varepsilon > 0$ and
$b \le b_\varepsilon \le 1$. By Lemma A.8, $g(t_0,a_0;b_1) = \inf_{a>0}\sup_{t<0} g(t,a;b_1)$ is a strictly
increasing function of $b_1$. If $b_\varepsilon > b$, we have
$$\inf_{\{(a,b_1)^T :\ (a,a^2/b_1^2)^T \in (\Omega_1(b))^-\}}\ \sup_{t<0} g(t,a;b_1) = \inf_{\{(a,b_1)^T :\ (a,a^2/b_1^2)^T \in (\Omega_1(b))^-\}} g(\hat t(a,b_1),a;b_1)$$
$$= g(\hat t(a_\varepsilon,b_\varepsilon),a_\varepsilon;b_\varepsilon)$$
$$\ge \inf_{a>0} g(\hat t(a,b_\varepsilon),a;b_\varepsilon)$$
$$> \inf_{a>0} g(\hat t(a,b),a;b)$$
$$= \inf_{a>0}\sup_{t<0} g(t,a;b).$$

By Lemma A.6, $a_0$ is unique. If $b_\varepsilon = b$, we have
$$g(\hat t(a_\varepsilon,b_\varepsilon),a_\varepsilon;b_\varepsilon) = g(\hat t(a_\varepsilon,b),a_\varepsilon;b) > g(t_0,a_0;b).$$
Combining the above facts, we have
$$\inf_{(a,a^2/b_1^2)^T \in (\Omega_1(b))^-}\ \sup_{t<0} g(t,a;b_1) > g(t_0,a_0;b). \tag{A.22}$$

Therefore, we have proved that $\delta_1 > 0$. By (A.20)-(A.22), we have that if
$n > n_1$,
$$P((\bar X, V_n^2)^T \in (\Omega_1(b))^-) \le \exp\{-ng(t_0,a_0;b) - n\delta_1/2\}.$$
The proof is complete. □